Results for “Activity of distinct growth factor receptor network components in breast tumors uncovers two biologically relevant subtypes”

## Loading required package: gplots
## 
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
## 
##     lowess
## Loading required package: RColorBrewer
## Loading required package: data.table
## Loading required package: mclust
## Package 'mclust' version 5.1
## Type 'citation("mclust")' for citing this R package in publications.
## Loading required package: ggplot2
## Loading required package: gridExtra
## Warning: package 'gridExtra' was built under R version 3.2.4
## Loading required package: devtools
## Loading required package: grid
## Loading required package: cluster
## Loading required package: plyr
## Loading required package: survival
## 
Read 23368 rows and 1120 (of 1120) columns from 0.361 GB file in 00:00:04

2(A)-(D): Two dominant phenotypes in breast cancer patients and cell lines

2(A): TCGA, all pathways

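# 2(A): TCGA, all pathways
# HeatmapAnnotation(), Heatmap(), and draw() come from ComplexHeatmap, which is
# not among the packages loaded above; attach it here if it is not already loaded.
library(ComplexHeatmap)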
create_figure2_a_TCGA_allpath <- function(expression_subtypes_df, showLegends=TRUE, ersubset="Full"){
  set.seed(123)
  phenotype.annotation <- data.frame(Phenotype=c("Survival Phenotype",
                                                 "Growth Phenotype",
                                                 "Growth Phenotype",
                                                 "Survival Phenotype",
                                                 "Survival Phenotype",
                                                 "Growth Phenotype",
                                                 "Growth Phenotype"))
  topha <- HeatmapAnnotation(df = phenotype.annotation,
                             col = list(Phenotype = c("Survival Phenotype" =  "coral3",
                                                      "Growth Phenotype" = "aquamarine4")),
                             height = unit(0.333, "cm"))
  maintitle <- "TCGA BRCA"
  if(ersubset != "Full"){
    expression_subtypes_df <- expression_subtypes_df[expression_subtypes_df$ER.Status == ersubset,]
    maintitle <- paste(maintitle,ersubset, sep=", ER ")
  }

  # Row annotation: receptor status (PR, HER2, ER) and PAM50 subtype,
  # expected in columns 10:13 of the input data frame
  ha_row_tcga2 <- HeatmapAnnotation(df = expression_subtypes_df[,10:13],
                                    col = list(PR.Status =   c("Positive" =  "#4DAF4A",
                                                               "Negative" = "#984EA3",
                                                               "Indeterminate" = "black",
                                                               "Unavailable" = "grey"),
                                               HER2.Status = c("Positive" =  "#FFFF33",
                                                               "Negative" = "#F781BF",
                                                               "Indeterminate" = "black",
                                                               "Equivocal" = "skyblue",
                                                               "Unavailable" = "grey"),
                                               ER.Status =   c("Positive" =  "#E41A1C",
                                                               "Negative" = "#377EB8",
                                                               "Indeterminate" = "black",
                                                               "Unavailable" = "grey"),
                                               PAM50 =       c("LumA" = brewer.pal(6, "Dark2")[5],
                                                               "LumB" = '#fccaa4',
                                                               "Her2" = brewer.pal(6, "Dark2")[4],
                                                               "Basal" = '#e41a1c',
                                                               "Normal" = brewer.pal(6, "Dark2")[6],
                                                               "Unavailable"="grey")),
                                    which = "row",
                                    width = unit(1.333, "cm"),
                                    show_legend = showLegends)
  h1 <- Heatmap(expression_subtypes_df[,1:7],
                cluster_rows = TRUE,
                cluster_columns = TRUE,
                show_row_names = FALSE,
                show_column_names = TRUE,
                row_title_gp = gpar(fontsize =10),
                combined_name_fun = NULL,
                top_annotation = topha,
                name="Scaled\nPathway\nActivity",
                column_title = maintitle,
                show_heatmap_legend = showLegends,
                column_dend_reorder = c(1,100,100,10,1,100,100),
                heatmap_legend_param = list(color_bar = "continuous"))
  draw(h1+ha_row_tcga2,row_dend_side = "left", annotation_legend_side = "bottom")
}

# Uncomment to create a PDF version of Figure 2(A)
# pdf("Fig2a.pdf", width=6)
# create_figure2_a_TCGA_allpath(single_pathway_best_tcga, showLegends = FALSE)
# dev.off()

create_figure2_a_TCGA_allpath(single_pathway_best_tcga)

2(B): ICBP, all pathways

2(C): TCGA, k-means of 4 pathways

2(D): ICBP, k-means of 4 pathways
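
The k-means step named in panels (C) and (D) could look roughly like the sketch below. It runs on simulated data, and every object and column name in it is hypothetical. The choice of four centers is an assumption made to match the four cluster labels that appear later in this report; the original clustering code is not shown here.

```r
# Minimal sketch: k-means on simulated activity scores for four pathways.
# All names below are hypothetical placeholders, not the report's objects.
set.seed(123)
n_samples <- 60
pathway_activity <- matrix(
  runif(n_samples * 4), ncol = 4,
  dimnames = list(paste0("sample_", seq_len(n_samples)),
                  paste0("pathway_", 1:4))
)

# Scale each pathway to mean 0 and unit variance before clustering
scaled_activity <- scale(pathway_activity)

# centers = 4 is an assumption; nstart = 25 tries several random starts
km <- kmeans(scaled_activity, centers = 4, nstart = 25)

# Number of samples assigned to each cluster
table(km$cluster)
```

Using several random starts (`nstart`) reduces the chance that the reported clusters reflect a poor local optimum.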

Unsupervised variance analysis confirms the significance of phenotypes in breast cancer biology

3(A)-(B): Perform Principal Component Analysis

## [1] "Proportion of variance contributed by first 5 principal components in ICBP gene expression data: 0.427196248976044"
## [1] "Proportion of variance contributed by first 5 principal components in TCGA BRCA gene expression data: 0.343182614266816"
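
A proportion like the ones printed above can be computed from the standard deviations that `prcomp()` returns. The sketch below uses a simulated samples-by-genes matrix as a stand-in; whether the original analysis centered or scaled the data in the same way is an assumption.

```r
# Minimal sketch: share of total variance captured by the first 5 PCs.
# `expr` is simulated; rows are samples and columns are genes.
set.seed(123)
expr <- matrix(rnorm(50 * 200), nrow = 50)

# Centering and scaling are assumptions about the preprocessing
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Variance of each component is the square of its standard deviation
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
prop_first5 <- sum(var_explained[1:5])

print(paste("Proportion of variance contributed by first 5 principal components:",
            prop_first5))
```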

3: Correlate TCGA Principal Components with ASSIGN generated GFRN pathway predictions
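
One way to relate principal components to pathway predictions is a correlation matrix with one row per component and one column per pathway. The sketch below assumes Spearman correlation and seven pathways (matching the seven heatmap columns used in Figure 2(A)); the data and names are simulated placeholders.

```r
# Minimal sketch: correlate the first 5 PCs with pathway activity scores.
set.seed(123)
n <- 40
expr <- matrix(rnorm(n * 100), nrow = n)   # samples x genes (simulated)
pathways <- matrix(
  runif(n * 7), nrow = n,
  dimnames = list(NULL, paste0("pathway_", 1:7))   # hypothetical names
)

# Sample scores on the first five principal components
pcs <- prcomp(expr, center = TRUE, scale. = TRUE)$x[, 1:5]

# Spearman is an assumption; rows are PCs, columns are pathways
pc_pathway_cor <- cor(pcs, pathways, method = "spearman")
round(pc_pathway_cor, 2)
```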

4(A)-(E): Survival and growth phenotypes express dichotomous cell survival mechanisms

4(A): Heatmap of breast cancer cell lines from Western Blots

4(C)-(E): Gene and protein expression are significantly different between growth and survival phenotypes

## [1] "733 names have been changed"
## [1] "analyzing BIM"
## [1] "BIM data available"

## [1] "analyzing MCL1"
## [1] "Sorry!This protein is unavailable in TCGA RPPA dataset!"
## [1] 14

## [1] 15

5(A) & (B): Growth factor network phenotypes reflect dichotomous drug response in breast cancer cell lines

5(A): Correlate ICBP GFRN pathway predictions with ICBP drug response and make Heatmap

5(B): Correlate GFRN pathway predictions with drug response in an independent drug assay and make Heatmap
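
Drug screens rarely cover every cell line for every compound, so a pathway-by-drug correlation usually has to tolerate missing values. The sketch below shows one possible approach with pairwise-complete Spearman correlations on simulated data. The drug and pathway names are hypothetical, and the original analysis may have handled missing values differently.

```r
# Minimal sketch: pathway-by-drug correlation matrix with missing responses.
set.seed(123)
n_lines <- 30
pathways <- matrix(
  runif(n_lines * 7), nrow = n_lines,
  dimnames = list(NULL, paste0("pathway_", 1:7))   # hypothetical names
)
drug_response <- matrix(
  rnorm(n_lines * 5), nrow = n_lines,
  dimnames = list(NULL, paste0("drug_", 1:5))      # hypothetical names
)

# Simulate cell lines that were not assayed for some drugs
drug_response[sample(length(drug_response), 10)] <- NA

# Each pathway/drug pair uses only the cell lines observed for that drug
drug_cor <- cor(pathways, drug_response,
                method = "spearman", use = "pairwise.complete.obs")
round(drug_cor, 2)
```

The resulting matrix (pathways in rows, drugs in columns) is the kind of input a heatmap function would take.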

6: Differential drug response identified in GFRN phenotype heterogeneity

Supplemental Analysis

Supplemental Figures 3(A)-(H): Code for the protein-based pathway validations is in the ./ASSIGN folder. Gene expression, mutation, and IHC-based validations are shown here.

## [1] "757 names have been changed"

Supplemental Figures 4(A)-(D): Pathway activity estimates between ER+ and ER- samples in breast cancer cell lines and patient data.

Supplemental Figures 5(A)-(G): Boxplot GFRN pathway activity across breast cancer clinical subtypes (IHC)

Supplemental Figures 6(A)-(G): Boxplot GFRN pathway activity across breast cancer intrinsic subtypes (PAM50)

Supplemental Figures 7: Graphical representation of the IHC and intrinsic subtype status distribution for ICBP cell line and TCGA breast tumors.

Supplemental Figure 8: Survival analysis with TCGA BRCA data…

## Warning: NAs introduced by coercion

## Warning: NAs introduced by coercion
## Call:
## survdiff(formula = Surv(tcga_clinicals$time, tcga_clinicals$vital_status) ~ 
##     tcga_clinicals$kmeans.cluster, rho = 1)
## 
##                                                    N Observed Expected
## tcga_clinicals$kmeans.cluster=Growth/BAD high    173     16.9     15.3
## tcga_clinicals$kmeans.cluster=Growth/BAD low     289     13.9     22.6
## tcga_clinicals$kmeans.cluster=Survival/HER2 high 359     33.8     28.3
## tcga_clinicals$kmeans.cluster=Survival/HER2 low  242     18.7     17.1
##                                                  (O-E)^2/E (O-E)^2/V
## tcga_clinicals$kmeans.cluster=Growth/BAD high        0.162     0.238
## tcga_clinicals$kmeans.cluster=Growth/BAD low         3.319     5.282
## tcga_clinicals$kmeans.cluster=Survival/HER2 high     1.080     1.916
## tcga_clinicals$kmeans.cluster=Survival/HER2 low      0.141     0.208
## 
##  Chisq= 5.5  on 3 degrees of freedom, p= 0.141
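
The call above uses `rho = 1`, which in `survdiff()` selects the Peto & Peto modification of the Gehan-Wilcoxon test rather than the default log-rank test (`rho = 0`). With p = 0.141, the four clusters do not differ significantly at the conventional 0.05 level. The sketch below reproduces the same kind of test on the `lung` data set that ships with the survival package; it stands in for the TCGA clinical table, which is not available here.

```r
# Minimal sketch: weighted log-rank test with rho = 1 on a built-in data set.
library(survival)

# `lung` and its `sex` grouping are stand-ins for the TCGA clinical data
fit <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 1)
print(fit)

# p-value from the chi-square statistic; df = number of groups - 1
p_value <- pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
p_value
```

Passing `data =` and bare column names, instead of `tcga_clinicals$...` inside the formula, also gives shorter group labels in the printed table.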

Supplemental Figure 9(A) & (B): Boxplot TCGA Principal Components across BRCA intrinsic subtypes (PAM50)

## [1] "Spearman correlation of PC1 and mean gene expression of each sample:  -0.786414519142557"
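
A strong correlation between PC1 and per-sample mean expression suggests that PC1 largely tracks overall expression level. The sign of a principal component is arbitrary, so a negative value carries the same information as a positive one of equal size. The sketch below shows the calculation on simulated data with a per-sample offset added on purpose; it is not the report's code.

```r
# Minimal sketch: Spearman correlation of PC1 with mean expression per sample.
set.seed(123)
expr <- matrix(rnorm(50 * 200), nrow = 50)   # samples x genes (simulated)

# Add a per-sample offset (recycled down each column) so that
# mean expression differs between samples
expr <- expr + rnorm(50)

pc1 <- prcomp(expr, center = TRUE)$x[, 1]
mean_expr <- rowMeans(expr)

rho <- cor(pc1, mean_expr, method = "spearman")
print(paste("Spearman correlation of PC1 and mean gene expression of each sample:", rho))
```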

Data for Supplemental Figure 10: Comparison of R² values (proportion of variance explained) from each model for principal components (PCs) 1 through 5 of the TCGA RNA-sequencing breast cancer data.

| PC | ER | ER.kmeans | ER.kmeans.pval | ER.PAM50 | ER.PAM50.pval | ER.phenotype | ER.phenotype.pval | ER.kmeans.PAM50.pval |
|--:|--:|--:|--:|--:|--:|--:|--:|--:|
| 1 | 0.0872084 | 0.1880783 | 0 | 0.1313824 | 0.0016441 | 0.1051838 | 0.0081916 | NA |
| 2 | 0.5606260 | 0.6955323 | 0 | 0.7471378 | 0.0000000 | 0.6922442 | 0.0000000 | 0.00e+00 |
| 3 | 0.0516185 | 0.3975292 | 0 | 0.2538693 | 0.0000000 | 0.2522985 | 0.0000000 | NA |
| 4 | 0.0285084 | 0.2789351 | 0 | 0.0781370 | 0.0010530 | 0.2072717 | 0.0000000 | NA |
| 5 | 0.0379096 | 0.1749349 | 0 | 0.2159659 | 0.0000000 | 0.0622188 | 0.0027064 | 2.48e-05 |

| PC | PR | PR.kmeans | PR.kmeans.pval | PR.PAM50 | PR.PAM50.pval | PR.phenotype | PR.phenotype.pval | PR.kmeans.PAM50.pval |
|--:|--:|--:|--:|--:|--:|--:|--:|--:|
| 1 | 0.0597998 | 0.1558167 | 0 | 0.1236342 | 6.03e-05 | 0.0658990 | 0.1304072 | NA |
| 2 | 0.4074544 | 0.6465939 | 0 | 0.7356092 | 0.00e+00 | 0.6263654 | 0.0000000 | 0.0e+00 |
| 3 | 0.0585759 | 0.3929508 | 0 | 0.2532850 | 0.00e+00 | 0.2241204 | 0.0000000 | NA |
| 4 | 0.0040064 | 0.2818516 | 0 | 0.0833947 | 7.60e-06 | 0.2130036 | 0.0000000 | NA |
| 5 | 0.0266319 | 0.1727616 | 0 | 0.2160601 | 0.00e+00 | 0.0402403 | 0.0261066 | 1.5e-05 |

| PC | HER2 | HER2.kmeans | HER2.kmeans.pval | HER2.PAM50 | HER2.PAM50.pval | HER2.phenotype | HER2.phenotype.pval | HER2.kmeans.PAM50.pval |
|--:|--:|--:|--:|--:|--:|--:|--:|--:|
| 1 | 0.0108250 | 0.1288564 | 0 | 0.1247560 | 0.0000000 | 0.0111700 | 0.7261904 | NA |
| 2 | 0.0000002 | 0.5087202 | 0 | 0.7246049 | 0.0000000 | 0.4222629 | 0.0000000 | 0.0000000 |
| 3 | 0.0328873 | 0.3930355 | 0 | 0.2570614 | 0.0000000 | 0.1150083 | 0.0000000 | NA |
| 4 | 0.0205140 | 0.2790501 | 0 | 0.0818977 | 0.0001486 | 0.2108924 | 0.0000000 | NA |
| 5 | 0.0234721 | 0.2073647 | 0 | 0.2238067 | 0.0000000 | 0.0315990 | 0.0865439 | 0.0068758 |

| PC | ER.PR.HER2 | ER.PR.HER2.kmeans | ER.PR.HER2.kmeans.pval | ER.PR.HER2.PAM50 | ER.PR.HER2.PAM50.pval | ER.PR.HER2.phenotype | ER.PR.HER2.phenotype.pval | ER.PR.HER2.kmeans.PAM50.pval |
|--:|--:|--:|--:|--:|--:|--:|--:|--:|
| 1 | 0.0982646 | 0.1905291 | 0 | 0.1329788 | 0.0084575 | 0.1129365 | 0.0166448 | NA |
| 2 | 0.5979224 | 0.7256479 | 0 | 0.7506555 | 0.0000000 | 0.7218103 | 0.0000000 | 0.0000000 |
| 3 | 0.0914616 | 0.4040233 | 0 | 0.2630988 | 0.0000000 | 0.2752829 | 0.0000000 | NA |
| 4 | 0.0539184 | 0.2824955 | 0 | 0.0890370 | 0.0105041 | 0.2169903 | 0.0000000 | NA |
| 5 | 0.0681934 | 0.2132864 | 0 | 0.2238788 | 0.0000000 | 0.1082416 | 0.0000891 | 0.0302152 |

| PC | kmeans | PAM50 | kmeans.PAM50 | ER.PR.HER2.PAM50.kmeans |
|--:|--:|--:|--:|--:|
| 1 | 0.1244270 | 0.1229359 | 0.2100966 | 0.2207230 |
| 2 | 0.4922497 | 0.7243437 | 0.7920581 | 0.8151674 |
| 3 | 0.3845233 | 0.2489111 | 0.4695138 | 0.4784226 |
| 4 | 0.2788131 | 0.0777884 | 0.2880172 | 0.2936144 |
| 5 | 0.1725182 | 0.2159571 | 0.2904661 | 0.3047475 |
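
The report does not show how the R² and p-value columns were computed. One plausible reading is that each PC was regressed on a baseline covariate (for example ER status), then on the baseline plus an added term (for example the k-means cluster), with the p-value coming from an F-test between the two nested models. The sketch below illustrates that reading on simulated data; the variable names and the choice of test are assumptions.

```r
# Minimal sketch: R-squared of nested linear models and an F-test between them.
set.seed(123)
n <- 200
er <- factor(sample(c("Positive", "Negative"), n, replace = TRUE))
kmeans_cluster <- factor(sample(paste0("cluster_", 1:4), n, replace = TRUE))

# Simulated PC scores that depend on both ER status and cluster
pc1 <- rnorm(n) + 0.5 * (er == "Positive") + 0.3 * as.integer(kmeans_cluster)

base_model <- lm(pc1 ~ er)                    # analogous to the "ER" column
full_model <- lm(pc1 ~ er + kmeans_cluster)   # analogous to "ER.kmeans"

r2_base <- summary(base_model)$r.squared
r2_full <- summary(full_model)$r.squared

# F-test: does adding the cluster term improve the fit?
pval <- anova(base_model, full_model)$`Pr(>F)`[2]

c(r2_base = r2_base, r2_full = r2_full, pval = pval)
```

Because the models are nested, the larger model's R² can never be lower than the baseline's, which is why a test of the improvement is more informative than the raw difference.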

Supplemental Figure 12: Correlations between pathway activation estimates and drug response values, compared between ER+ and ER- samples and between HER2+ and HER2- samples in breast cancer cell lines.

## [1]  17 115
## [1]  31 115
## NULL

## NULL

## NULL

## NULL

Supplemental Figure 13: Comparison of Lapatinib sensitivity

## The following `from` values were not present in `x`: Survival/HER2 high, Survival/HER2 low, Growth/BAD high, Growth/BAD low

This analysis was run on Fri Mar 31 02:20:00 2017